ABSTRACT

Using mock observations of numerical simulations, we investigate the completeness and efficiency of 
searches for galaxy clusters in weak lensing surveys. While it is possible to search for high mass objects 
directly as density enhancements using weak lensing, we find that line-of-sight projection effects can be 
quite serious. For the search methods that we consider, to obtain high completeness requires acceptance 
of a very high false-positive rate. The false positive rate can be reduced only by significantly degrading 
the completeness. Both completeness and efficiency are dependent upon the filter used in the search and 
the desired mass threshold, emphasizing that a measurement of the 3D mass function from gravitational 
lensing is affected by a number of biases which mix 'cosmological' and observational issues. 

Subject headings: cosmology: theory — large-scale structure of universe — gravitational lensing 



1. INTRODUCTION 

Clusters of galaxies are one of our most important cos- 
mological probes. As the most recent objects to form 
in the universe their number density and properties are 
exquisitely sensitive to our modeling assumptions. Their 
composition accurately reflects the mix of matter in the 
universe. They are bright and can be "easily" seen to 
large distances, allowing constraints on the crucial interval 
$0 < z \lesssim 1$ where the universal expansion changes from deceleration
to acceleration. They are located close to their
formation site. Being bright and sparse they are excel- 
lent tracers of the large-scale structure - they are highly 
biased so their clustering is easy to measure and is much 
more straightforwardly computed from theory than that 
of galaxies. 

For many years it has been realized that a large, ho- 
mogeneous, sample of clusters would impose particularly 
strong constraints on our model of structure formation, 
provided all clusters above a given mass threshold were 
included. In a recent manifestation of this idea Haiman,
Mohr & Holder (2001) have suggested using the counts of
clusters of galaxies above a certain mass threshold to probe 
the evolution of the dark energy believed to be causing the 
acceleration of the universal expansion. A more pedestrian 
goal would be to constrain the matter density and amplitude
of the mass fluctuations in the universe (e.g. Holder,
Haiman & Mohr 2001 for a recent discussion). A study
of the clustering of such samples could provide another
measurement of the angular diameter distance-redshift
relation (Cooray et al. 2001) or the linear growth rate of
fluctuations. 

However, constructing large samples of galaxy clusters 
for statistical analyses remains a difficult task. For many 
years the only available catalogues were based on pro- 
jected galaxy overdensity, though it was quickly realized 
that such samples suffer from projection effects and the 
large scatter between optical richness and cluster mass 
(for recent theoretical studies see e.g. van Haarlem, Frenk
& White 1997; Reblinsky & Bartelmann 1999; White &
Kochanek 2001). At present cluster samples have been
constructed using optical, X-ray and Sunyaev-Zel'dovich
'surveys', each with their own advantages and disadvantages.

In the last few years it has become possible to search for 
high mass objects directly as density enhancements using 
weak gravitational lensing. Since the first detections by
Bonnet, Mellier & Fort (1994) and Fahlman et al. (1994),
this technique has been demonstrated many times on previously
known clusters (see Wu et al. 1998 and Hattori et al. 1999
for recent reviews). Erben et al. (2000) and
Umetsu & Futamase (2000) have reported 'dark clumps',
where mass concentrations are seen with no optical or X-ray
detection of a cluster. To date there is only one case of
a confirmed cluster discovered first with weak gravitational
lensing: Wittman et al. (2001) serendipitously detected a
$z = 0.276$ cluster with velocity dispersion $\sim 600$ km/s in the
corner of one of the fields they used previously for cosmic shear.

Thus lensing offers a completely new way to find clus- 
ters. It has been claimed by some authors that weak lens- 
ing offers our first chance to construct a truly mass selected 
sample of clusters. As with all new methods, lensing avoids 
some of the problems which plague other methods but has 
its own systematics. In this paper we want to look at the 
power and pitfalls of this method. Specifically we want to 
look at how efficient and complete weak lensing surveys 
are at constructing a mass selected sample of clusters, as- 
suming ideal observing conditions. 

2. SIMULATED OBSERVATIONS 

2.1. N-body simulation 

We wish to make simulated convergence maps which are 
as realistic as possible while at the same time knowing the 
real 3D location of any clusters in the field. Our pro- 
gramme begins with a model for the evolution of the dark 
matter which governs the formation of large-scale struc- 
ture. On Mpc scales we expect that the baryonic mat- 
ter will faithfully trace the dark matter, thus our model 
should reproduce the spatial distribution of mass. This 
problem can be well tackled by modern numerical simulations.
We have run a $512^3$ particle simulation of a $\Lambda$CDM
model in a $300\,h^{-1}$Mpc box using the TreePM-SPH code
(White et al. 2001) operating in collisionless (dark matter
only) mode. This simulation represents a large cosmological
volume, to include a fair sample of rich clusters, while
maintaining enough mass resolution to identify galactic mass
halos. Because it provides a reasonable fit to a wide range of
observations, including the present day abundance of rich
clusters of galaxies (Pierpaoli, Scott & White 2001), we have
simulated a $\Lambda$CDM cosmology with $\Omega_{\rm m} = 0.3$,
$\Omega_\Lambda = 0.7$, $H_0 = 100\,h\,{\rm km\,s^{-1}\,Mpc^{-1}}$ with
$h = 0.7$, $\Omega_{\rm B} = 0.04$, $n = 1$ and $\sigma_8 = 1$. The simulation
was started at $z = 60$ and evolved to the present with the
full phase space distribution dumped every $100\,h^{-1}$Mpc
from $z \simeq 2$ to $z = 0$. The gravitational force softening
was of a spline form, with a "Plummer-equivalent"
softening length of $20\,h^{-1}$kpc comoving. The particle
mass is $1.7 \times 10^{10}\,h^{-1}M_\odot$, allowing us to find bound
halos with masses several times $10^{11}\,h^{-1}M_\odot$ and giving
several tens of thousands of particles in a cluster mass halo
($> 10^{14}\,h^{-1}M_\odot$) to begin to resolve substructure.

Fig. 1. - (top left) A simulated $\kappa$ map, 5° on a side, with a linear
greyscale which maps all pixels with $\kappa < 0$ to white. (top right) The
same field with noise added. (bottom left) The noisy map, smoothed with
a Gaussian of 1'. (bottom right) The $M_{\rm ap}$ map obtained from the
noisy map with a filter of about 5'.

For every output of the simulation we produce a halo 
catalogue by running a "friends-of-friends" group finder
(e.g. Davis et al. 1985) with a linking length $b = 0.15$ (in
units of the mean interparticle spacing). This procedure 
partitions the particles into equivalence classes, by linking 
together all particle pairs separated by less than a distance 
$b$. We use 0.15 rather than the more canonical $b = 0.2$
as we find that the larger linking length frequently merges 
what we would by eye list as separate halos. By effectively 
'removing' one of the halos from our list this would have a 
deleterious impact upon our efficiency calculations. While 
this effect is not entirely eliminated when using b = 0.15, 
it is significantly reduced. We keep all groups above 64 
particles, which imposes a minimum halo mass of order 
$10^{12}\,h^{-1}M_\odot$. For each identified halo we compute, using
the 3D distribution of all of the particles in the simulation,
the mass (we use $M_{200}$, the mass enclosed within a radius,
$r_{200}$, within which the mean density is 200 times the critical
density at that redshift), velocity dispersion etc. so
we can understand our selection in terms of the intrinsic, 
rather than projected, cluster properties. 
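
The group-finding step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used to build the catalogues: the function name `friends_of_friends`, the brute-force pair search and the union-find bookkeeping are all hypothetical choices made for brevity, and a production code would use a tree or grid to find neighbours.

```python
def friends_of_friends(points, box, b=0.15, min_members=1):
    """Link all pairs closer than b times the mean interparticle spacing.

    points: list of (x, y, z) positions in a periodic cube of side `box`.
    Returns the groups (lists of point indices), largest first.
    """
    n = len(points)
    link = b * box / n ** (1.0 / 3.0)      # linking length in box units
    parent = list(range(n))                # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Brute-force O(N^2) pair search (illustrative only).
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, c in zip(points[i], points[j]):
                d = abs(a - c)
                d = min(d, box - d)        # periodic wrap
                d2 += d * d
            if d2 < link * link:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri        # merge the two equivalence classes

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    kept = [g for g in groups.values() if len(g) >= min_members]
    return sorted(kept, key=len, reverse=True)
```

Calling `friends_of_friends(points, box=300.0, b=0.15, min_members=64)` would mirror the linking length and 64-particle cut quoted in the text, although the quadratic pair search makes this sketch practical only for small test sets.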



where $D$ is the (comoving) angular diameter distance,

$$ \frac{dD}{da} = \frac{1}{a^2 H(a)} \tag{2} $$

to a fiducial point beyond the furthest source, $t = D/D_s$
is a dimensionless distance, $w(t)$ is a weight function

$$ w(t) = \int dt_s\, \frac{dn}{dt_s}\, \frac{t\,(t_s - t)}{t_s} \tag{3} $$

which reduces to $t(1-t)$ if all of the sources are at a fixed
distance $D_s$, and $\delta = \rho/\bar\rho - 1$. We approximate this
integral as a weighted sum of the projected mass $\Sigma$ in each
plane, with the projection being done parallel to the edges 
of the simulation volume. Studies by Hamana, Colombi &
Mellier (2000) have indicated that this method produces
results which are almost identical to more numerically
intensive ray tracing simulations. We have found that there
is a slight positive bias in the power spectrum of $\kappa$ using
this method as compared to the "tiling" method of White
& Hu (2000), which does not approximate the path with
segments parallel to the box boundaries. This small bias 
will not affect any of our conclusions. 
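
The weighted sum over planes can be illustrated with a short sketch. It assumes a flat model, sources at a single redshift so that $w(t) = t(1-t)$, and units with $c = 1$; the function names, the trapezoid integration and the bisection inversion are illustrative choices and not taken from the paper.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m=0.3, steps=400):
    """Comoving distance to redshift z in h^-1 Mpc (flat model, trapezoid rule)."""
    if z <= 0.0:
        return 0.0

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return C_OVER_H0 * total * dz


def redshift_at(distance, omega_m=0.3):
    """Invert comoving_distance by bisection on 0 < z < 10."""
    lo, hi = 0.0, 10.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if comoving_distance(mid, omega_m) < distance:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def plane_weights(z_source=1.0, thickness=100.0, omega_m=0.3):
    """Return one (z, t, weight) tuple per slab between observer and source.

    kappa along a line of sight is then approximated by
    sum(weight_i * delta_i), with delta_i the mean overdensity in slab i.
    """
    d_s = comoving_distance(z_source, omega_m)
    prefactor = 1.5 * omega_m * (d_s / C_OVER_H0) ** 2
    dt = thickness / d_s
    planes = []
    d = 0.5 * thickness                    # centre of the first slab
    while d < d_s:
        t = d / d_s
        z = redshift_at(d, omega_m)
        weight = prefactor * dt * t * (1.0 - t) * (1.0 + z)  # 1/a = 1 + z
        planes.append((z, t, weight))
        d += thickness
    return planes
```

In this sketch the last slab may extend slightly beyond the source plane; a more careful treatment would truncate it.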

We have chosen $100\,h^{-1}$Mpc as our sampling interval because
it is large enough that edge effects are minimal even
for rich clusters while being fine enough that line-of-sight
integrals are well approximated by sums over the (static)
outputs. However, even though only a small fraction of
clusters lie within $r_{200}$ of a slice boundary, we decided to
require that the orientation and offset change only on every
third slice. Thus if we choose the front of the box at one
redshift, the next slice is required to be the middle and
the next the back. In this manner the structure is continuous
across $300\,h^{-1}$Mpc distances, but still evolves in steps
of $100\,h^{-1}$Mpc, not $300\,h^{-1}$Mpc.
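
One way to picture this bookkeeping is the following sketch. It is an assumption-laden illustration rather than the procedure actually used: the function name `slice_schedule`, the dictionary layout, and the choice to start each triple at a random third and wrap periodically are all hypothetical.

```python
import random


def slice_schedule(n_slices, box=300.0, seed=0):
    """Pick a line-of-sight axis, transverse offset and box third per slab.

    The axis and offset are redrawn only on every third slab, and within
    each triple the thirds are taken in consecutive (periodic) order, so
    structure stays continuous across one full box length.
    """
    rng = random.Random(seed)
    schedule = []
    for i in range(n_slices):
        if i % 3 == 0:
            axis = rng.randrange(3)                          # projection axis
            offset = (rng.uniform(0.0, box), rng.uniform(0.0, box))
            start = rng.randrange(3)                         # third used first
        third = (start + i % 3) % 3
        schedule.append({"slab": i, "axis": axis, "offset": offset, "third": third})
    return schedule
```

Each slab would then be drawn from the simulation output closest in time to that slab's redshift, so the density field still evolves slab by slab while its geometry changes only once per box length.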



2.2. The convergence map 

To make a weak lensing map we need to perform an integral
of the mass density (times a weight function) back
along the line-of-sight to the redshift of the furthest source,
$z_{s,\max}$. Since the sources are at a cosmological distance,
while our simulation volume is relatively small, we do this
by "stacking" different slices through the box at earlier
and earlier output times (see Fig. 1 of da Silva et al. 2000
for a diagram of this approach in another context). Specifically
we divide the simulation box at every output up into
three pieces of $300 \times 300 \times 100\,h^{-1}$Mpc by trisecting in the
line-of-sight dimension. A given observational field is then
obtained by stepping back from $z = 0$ to $z = z_{s,\max}$ choosing,
every $100\,h^{-1}$Mpc along the line-of-sight, the density
field from one third of the box (see below) at that output.
To avoid repeatedly tracing through the same structure
we shift the simulation volume perpendicular to the line-of-sight
by a random amount and wrap using the periodicity
of the simulation volume. All of the mass in that third
of the box is projected onto the sky in this manner. The
convergence is

$$ \kappa = \frac{3}{2}\,\Omega_{\rm mat}\,(H_0 D_s)^2 \int dt\; w(t)\,\frac{\delta}{a} \tag{1} $$
Fig. 2. - The 'projected' mass function of clusters in our five
$5^\circ \times 5^\circ$ simulated lensing fields. The squares indicate all clusters
lying between the source and the observer in the fields, the triangles
those for which the lensing kernel, $t(1-t)$, is greater than half
its peak value. We expect that extreme projection effects will be
smaller for this restricted sample of clusters, since their weight in
the lensing map is larger.
As a first step we have generated 5 maps, each $5^\circ \times 5^\circ$
and $1024^2$ pixels, with sources fixed at $z_s = 1$ (one example
is shown in Fig. 1). A survey which is significantly
shallower suffers a loss of lensing signal and becomes more
sensitive to systematics related to intrinsic alignments of
galaxies. A much deeper survey is extremely expensive
in telescope time. Thus $z_s \sim 1$ has been chosen by most
workers in the field. By choosing all of our sources to lie at
precisely $z_s = 1$ we obtain a good first approximation to
planned or existing surveys, while at the same time decoupling
any uncertainties in the source distribution from the
effects we are concerned with in this work. Our maps are
only approximately independent as they are drawn from
the same simulation, but the random orientations allow us
to sample different possible projection effects. For each
map we can add Gaussian pixel noise at a specified level
and we smooth the maps (using Fourier transform methods)
with a Gaussian beam of a specified FWHM or convolve
the map with some other filter.

2.3. The group catalog 

The same random slices and offsets are used to project
the group catalogue into the field, and to produce 3D
images$^1$ of the field of view from the particle distribution
(which we use to check for projection effects). We shall
denote by 'cluster' any halo having $M_{200} > 10^{14}\,h^{-1}M_\odot$
and our catalog includes all such halos lying in the field
with $z < 1$. We define the center of a cluster as the position
of the potential minimum, calculating the potential
using only the particles in the FoF group. This proved to
be more robust than using the center of mass, as the potential
minimum coincided closely with the density maximum
for all but the most disturbed clusters. The mass function
of clusters lying between the source and observer in our
5 simulated fields is shown in Fig. 2. The distribution of
angular sizes is shown in Fig. 3.

$^1$The particle distribution is visualized using Tipsy from the
N-body shop at the University of Washington.

Fig. 3. - The distribution of projected 'virial' radii, $\theta_{200}$, for the
clusters in our five $5^\circ \times 5^\circ$ simulated lensing fields with
$M_{200} > 10^{14}\,h^{-1}M_\odot$. We define $\theta_{200}$ as the angle subtended
by $r_{200}$ at the distance of the cluster.



2.4. Finding & matching peaks 

For a given (possibly noisy and smoothed) map we need 
an algorithm for finding peaks. We have chosen a simple 
strategy whereby we search the map for all pixels which 
are a local maximum (this is essentially the procedure used 
on real data). This set forms our base peak list, which 
we number. We then search around each maximum and
include the adjacent pixels if $\kappa > f\kappa_{\max}$, where $f$ is a user
specified fraction typically set to 0.7. The peaks are all
extended at the same rate, so that adjacent peaks do not
swallow each other. This algorithm then returns for every
pixel in the map the peak number (or possibly "no peak")
to which it belongs. For each peak we keep track of both
the maximum $\kappa$ and the sum, $\kappa_{\rm sum}$, of the convergence in
all pixels above the threshold $f\kappa_{\max}$. The latter is loosely
correlated with mass. The exact extent of the peaks and
the precise definition of $\kappa_{\rm sum}$ will not affect our results.
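
A minimal sketch of such a peak finder is given below. It is one possible reading of the description above, not the original software: the 8-pixel neighbourhood, the breadth-first growth used to extend all peaks "at the same rate", and the names `find_peaks`, `kmax` and `ksum` are assumptions.

```python
from collections import deque

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def find_peaks(kappa, f=0.7):
    """Find local maxima of a 2D map and grow each down to f * kmax.

    kappa: list of rows. Returns (peaks, label), where label[y][x] is the
    index of the peak owning that pixel, or -1 for "no peak".
    """
    ny, nx = len(kappa), len(kappa[0])

    def inside(y, x):
        return 0 <= y < ny and 0 <= x < nx

    # Step 1: pixels strictly higher than all of their neighbours.
    peaks = []
    for y in range(ny):
        for x in range(nx):
            v = kappa[y][x]
            if all(v > kappa[y + dy][x + dx]
                   for dy, dx in NEIGHBOURS if inside(y + dy, x + dx)):
                peaks.append({"pos": (y, x), "kmax": v, "ksum": 0.0})

    # Step 2: multi-source breadth-first growth, one ring at a time,
    # so that neighbouring peaks expand together.
    label = [[-1] * nx for _ in range(ny)]
    queue = deque()
    for n, p in enumerate(peaks):
        y, x = p["pos"]
        label[y][x] = n
        queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        n = label[y][x]
        for dy, dx in NEIGHBOURS:
            yy, xx = y + dy, x + dx
            if (inside(yy, xx) and label[yy][xx] < 0
                    and kappa[yy][xx] > f * peaks[n]["kmax"]):
                label[yy][xx] = n
                queue.append((yy, xx))

    # Step 3: sum the convergence over every pixel assigned to each peak.
    for y in range(ny):
        for x in range(nx):
            if label[y][x] >= 0:
                peaks[label[y][x]]["ksum"] += kappa[y][x]
    return peaks, label
```

Because flat regions contain no strict maxima, a noise-free map with large empty areas produces no spurious peaks there in this sketch.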

In addition to our 'peak list' we have our 'cluster list' 
from the 3D cluster catalogues. Given the very large num- 
ber of both peaks and clusters, and the lack of distance 
information in the peak list, matching these can be quite 
problematic. We perform the match in two directions: 
whether a peak is in the cluster list (forward match) and 
whether a cluster is in our peak list (backward match). 
We typically find that all clusters above $10^{14}\,h^{-1}M_\odot$ lie
within $\pm 1$ pixel of an extended peak, but some clusters
lie near peaks (local maxima) with very low values of $\kappa$
(see below). Because each of the lists is so long and we 
are only using 2D information in associating peaks to clus- 
ters, a 'match' is claimed only if the forward and backward 
associations agree. 

The code produces a list of peaks and halos with matches 
flagged (plus cases with only a forward or backward match). 
There are two key numbers which we shall focus on below. 
The first is the fraction of peaks which matched at least 
one halo, which will determine our 'efficiency'. The second 
is the fraction of halos which matched at least one peak, 
which will determine our 'completeness'. 
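
The two-way matching and the two summary numbers can be sketched as follows. This is a simplified stand-in for the procedure described above: it assumes that peaks and clusters are given as 2D pixel positions, that an association is the nearest object within a hypothetical `max_sep`, and that a match requires the forward and backward associations to agree. Unlike the text, it does not credit two halos that fall on the same extended peak.

```python
import math


def nearest(src, targets, max_sep):
    """Index of the target closest to src within max_sep, or None."""
    best, best_d = None, max_sep
    for j, t in enumerate(targets):
        d = math.hypot(src[0] - t[0], src[1] - t[1])
        if d <= best_d:
            best, best_d = j, d
    return best


def match_catalogues(peaks, clusters, max_sep=1.5):
    """Mutual matching of peak and cluster positions (pixel units).

    Returns (matches, efficiency, completeness), where efficiency is the
    fraction of peaks matched to a cluster and completeness the fraction
    of clusters matched to a peak.
    """
    forward = [nearest(p, clusters, max_sep) for p in peaks]    # peak -> cluster
    backward = [nearest(c, peaks, max_sep) for c in clusters]   # cluster -> peak
    matches = [(i, j) for i, j in enumerate(forward)
               if j is not None and backward[j] == i]
    efficiency = len({i for i, _ in matches}) / len(peaks) if peaks else 0.0
    completeness = len({j for _, j in matches}) / len(clusters) if clusters else 0.0
    return matches, efficiency, completeness
```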
Fig. 4. - The ratio of the "lensing mass" to the true mass for
clusters above $10^{14}\,h^{-1}M_\odot$ for our five $5^\circ \times 5^\circ$ lensing fields. The
solid histogram is all clusters and the dashed line is those for which
$w > w_{\max}/2$ (see text).
Note that as we begin to smooth the maps and add
noise, the 1-1 correspondence between peaks and halos will
begin to degrade. We take the attitude that all potential
detections would be followed up with e.g. X-ray observations
or redshifts. Thus if two halos match a single extended
peak we can count those both as detections since
follow-up of that area of sky would presumably find both
of them.

Our method is very robust and can be easily automated, 
but it is not perfect. In particular it can cause us to under- 
estimate the efficiency of a lensing search due to the way 
we define our halos. For example, if two massive halos are 
close together (perhaps in the initial stages of a merger or 
interaction) they can be linked by our group finder into a 
single halo. Our group catalog will then contain param- 
eters only for the mass around the potential minimum, 
missing the other halo entirely. This halo still has a large 
amount of mass associated with it however, and is quite 
overdense, so it will likely correspond to a $\kappa$ peak. This
peak will be erroneously counted as a miss as there is no 
corresponding entry in the cluster list. 

This effect is mitigated to a large extent by the rela- 
tively small linking length we have chosen to define our 
3D catalog. Neighboring halos are linked only if the material
between them is $\gtrsim 10^2 \times$ overdense. We have found
one example of such an artificial linking for systems of 
cluster mass, corresponding to a chain of halos lying along 
a filament. This single system changes our results by less 
than a percent. Inspection of other 'strange' peaks has not 
yielded any other examples of this effect. 

3. RESULTS 
3.1. Raw maps 

We first present results on the raw $\kappa$ maps, i.e. without
adding noise or smoothing the map. These results will
set a theoretical 'best' performance and allow us to understand
how much things are degraded by noise and smoothing.
Our results for the noise-free, unsmoothed maps are
listed in Table 1. We see that there are a very large number
of peaks, and that almost all clusters can be matched
to peaks (we discuss the rare misses below). If we apply a
threshold to our peak finder to reduce the peaks to a reasonable
number, our completeness begins to drop rapidly.

The reason for this can be seen by considering the distribution
of $\kappa$ values in the region of known clusters. We
have chosen to do this in the context of cluster mass measurements,
realizing that this issue has been discussed in
many contexts previously. As a first step we went through
our cluster list and for all clusters above $10^{14}\,h^{-1}M_\odot$ we
summed the values of $\kappa$ within a disk centered on the
known cluster center and with radius equal to the known
value of $r_{200}$. Using the cluster redshift we then converted
this to a mass. A histogram of the mass compared to the
true mass is shown in Fig. 4, where we see that there is
both a large scatter, as has been emphasized before$^2$, and
a positive offset. Perhaps the most dramatic are the few
clusters with a negative mass. These arise when a cluster
in a region where the lensing kernel is small has a large
void projected along the line-of-sight at a distance where
the lensing kernel is large. This leads to a negative mean $\kappa$
even though there exists a significant mass overdensity. To
isolate this effect we further require that the lensing kernel
at the cluster position be more than half of its peak value.
This largely removes the negative tail, although even with
this cut 34 clusters (out of more than 1750) with negative
lensing mass remain.

Table 1

Peak statistics for the 'raw' maps, as a function of
threshold $\kappa_{\max}$. The columns give the number of peaks
above the thresholds listed, and the fraction of clusters
(halos with $M_{200} > 10^{14}\,h^{-1}M_\odot$) satisfying the lensing
kernel cut which were found by matching to those peaks.
There were 2415 and 1775 clusters with $w > 0$ and
$w > w_{\max}/2$ respectively.

  kappa_max    Peaks     Clusters (w > 0)    Clusters (w > w_max/2)
  All          402154    0.99                0.99
  > 0.0        288933    0.98                0.99
  > 0.2          4712    0.62                0.81
  > 0.4           629    0.21                0.28

The tail to very large mass ratios occurs when a massive
cluster is projected on top of a less massive cluster. These
lines-of-sight were excluded in the analysis of Metzler et
al. (1999), but we have not done so here. Thus our procedure
will return something close to the cumulative mass
for each of these systems, which will be a large overestimate
for the low mass system. We show an example of
such an overlap in Fig. 5. Again, restricting ourselves to
clusters near the peak of the lensing kernel reduces this effect.
We shall return to the projection effects on the mass
measurement of the detected peaks at the end of § 3.3.

While the details change, this kind of effect is also seen
in the distribution of peak $\kappa$ values or in other measures
of 'mass' based on lensing.

3.2. Adding noise 

While the above results can serve to indicate some of the 
pitfalls in weak lensing searches for clusters, they do not 
indicate how a realistic search would perform. To make 
further steps in this direction we need to include 'noise' in 
our maps. We will consider an 'ideal' or optimistic survey 
for the purposes of highlighting the cosmological, rather 
than observational, effects. Thus we shall neglect sources 
of noise intrinsic to the telescope or detector system and 
focus instead on the irreducible 'noise' due to galaxy properties.
Following van Waerbeke (2000) we model the noise
in a $\kappa$ map that would arise from processing a shear map
using a technique such as that of Kaiser & Squires (1993)
as Gaussian, correlated only by any smoothing kernel applied
to the map. If the mean intrinsic ellipticity of the
source galaxies is $\gamma_{\rm int}$ then the noise introduced in our $\kappa$
map, for pixels of side $\theta_{\rm pix}$, has variance

$$ \sigma_{\rm pix}^2 = \frac{\gamma_{\rm int}^2}{n\,\theta_{\rm pix}^2} \tag{4} $$



where $n$ is the mean number density of sources. Assuming
$\gamma_{\rm int} = 0.2$ (which is slightly optimistic for ground based
work; Kaiser 1998; Crittenden, Natarajan, Pen & Theuns 2001)
and $n = 25$ gal/arcmin$^2$ (again slightly optimistic)
we have $\sigma_{\rm pix} = 0.14$ for our $0.3'$ pixels. This level
of noise is quite optimistic compared to current observations,
which will make our conclusions conservative. None
of our main results will depend on our specific choice of
$\sigma_{\rm pix}$, though the absolute meaning of our signal-to-noise
cuts will clearly scale with $\sigma_{\rm pix}$.

$^2$See for example the discussion in §7.3 of Metzler, White & Loken
(1999).

Fig. 5. - A zoom in of one of our convergence maps showing two clusters
which lie almost on top of each other in projection. (left) The
full $\kappa$ map. (middle) The portion coming from the $100\,h^{-1}$Mpc slice
at low redshift. (right) The portion coming from the $100\,h^{-1}$Mpc slice at
higher redshift. The regions shown are 0.3° on a side.
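
The quoted pixel noise can be checked with a few lines of arithmetic. The sketch below assumes the variance is $\gamma_{\rm int}^2/(n\,\theta_{\rm pix}^2)$, which reproduces the value of 0.14 stated in the text for a 5° map with 1024 pixels on a side; the function names `pixel_noise_rms` and `add_noise` are illustrative only.

```python
import math
import random


def pixel_noise_rms(gamma_int, n_gal, theta_pix):
    """Rms convergence noise per pixel.

    gamma_int: intrinsic ellipticity, n_gal: sources per arcmin^2,
    theta_pix: pixel side in arcmin.
    """
    return gamma_int / math.sqrt(n_gal * theta_pix ** 2)


def add_noise(kappa, sigma, seed=0):
    """Return a copy of the map with independent Gaussian noise per pixel."""
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in kappa]


theta_pix = 5.0 * 60.0 / 1024          # pixel side in arcmin (about 0.3')
sigma_pix = pixel_noise_rms(0.2, 25.0, theta_pix)
```

Since the noise scales as the inverse square root of the source density, halving $n$ would raise `sigma_pix` by a factor of $\sqrt{2}$.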

Even this level of noise is quite large compared to our 
signal, so we need to smooth the maps to enhance the 
contrast of our signal to the noise. We have not at- 
tempted to search for an 'optimal' filter, matched to the 
predicted shape of the cluster. Instead we have chosen to 
either smooth the maps with a Gaussian whose FWHM is 
roughly matched to the size of clusters at cosmological distances,
$1'$-$2'$, or to apply the $M_{\rm ap}$ filter already discussed
in the literature (see § 3.3).

The top panel of Fig. 6 shows how a survey's completeness
and efficiency depend on the smoothing scale in the
case of no noise. As expected, increasing the smoothing
decreases the number of false positives and thus the expense
of follow up observations; but it also decreases the survey
completeness. The best match of peak threshold and
smoothing will depend on the trade offs between these two
issues. The trade off is also affected by the level of noise
in the map, as is shown in the bottom panel of Fig. 6.
Note that for no combination of parameters is it possible
to have > 50% completeness with < 50% contamination!
This may be traced primarily to the large scatter in the
mass-peak relation shown in Fig. 4 (for a discussion of
precisely this effect in redshift surveys for clusters, see White
& Kochanek 2001).

We note that only for very high thresholds is our efficiency
close to 100%. This means that there are quite
prominent peaks in the maps which do not match any
cluster with $M_{200} > 10^{14}\,h^{-1}M_\odot$. This could have relevance
to the question of 'dark clusters' (see Fischer 1999;
Erben et al. 2000; Umetsu & Futamase 2000).



3.3. Matched filter 

One can enhance the signal-to-noise for cluster-like structures
by convolving the $\kappa$ map with a 'matched filter'. For
a certain class of such filters, known as aperture mass measures
(Schneider 1996; Schneider et al. 1998), this operation
is easy to implement directly on the shear field itself:
convolution of the $\kappa$ map with a kernel $U$ can be shown to
be the same as convolving the tangential shear map with
a related kernel $Q$ provided the kernel $U$ vanishes outside
of some radius $\vartheta$ and is compensated, viz

$$ \int_0^{\vartheta} d\theta\; \theta\, U(\theta) = 0 \tag{5} $$

The convolution of $\kappa$ with $U$ to produce the $M_{\rm ap}$ map is
a bandpass filter on the convergence map, with the filter
function quite narrow in (spatial) frequency space (Bartelmann
& Schneider 1999). Thus we may expect that for an
appropriately chosen $M_{\rm ap}$ scale, $\vartheta$, we will enhance the
contrast of clusters in the map.

Fig. 6. - Completeness and efficiency as a function of smoothing
for maps with (top) no noise and (bottom) noise of rms 0.14
per pixel. Solid lines are the fraction of clusters (groups with
$M_{200} > 10^{14}\,h^{-1}M_\odot$) in the 5 fields which are matched
(completeness), dashed lines are the fraction of identified peaks which
correspond to clusters (efficiency). The symbol types denote the
$\kappa$ threshold for counting peaks: (squares) $\kappa_{\max} > 0$, (triangles)
$\kappa_{\max} > 0.1$, (circles) $\kappa_{\max} > 0.2$.

We have created, from our noisy $\kappa$ maps, a series of $M_{\rm ap}$
maps with different filtering scales. We use the simplest
$M_{\rm ap}$ kernel

$$ U(\theta) = \frac{9}{\pi\vartheta^2}\,(1 - u^2)\left(\frac{1}{3} - u^2\right) \qquad \theta \le \vartheta \tag{6} $$

Here $u = \theta/\vartheta$ and $\vartheta$ is the filter size. We perform this
convolution directly on the $\kappa$ map, with $M_{\rm ap}$ filters for
which $\vartheta$ is a multiple of the pixel scale. A typical cluster
is $\sim 10$ pixels across, so pixelization effects are not too
severe on the scales of interest. Convolution of the $\kappa$ map
with this kernel indeed enhances the visibility of clusters
in the maps as can be seen in Fig. 1. Finally we produce
$S$-statistic maps by dividing our $M_{\rm ap}$ maps by the rms
fluctuation of the noise map.
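
The compensated filter and its application to a pixelised map can be sketched as follows. The kernel is taken to be the polynomial aperture-mass filter $U(\theta) = 9/(\pi\vartheta^2)\,(1-u^2)(1/3-u^2)$ with $u = \theta/\vartheta$; the direct-sum convolution, the function names and the convention that the filter scale is given in pixels are assumptions made for illustration.

```python
import math


def u_filter(theta, scale):
    """Compensated aperture-mass kernel U(theta); zero outside theta = scale."""
    x = theta / scale
    if x > 1.0:
        return 0.0
    return 9.0 / (math.pi * scale ** 2) * (1.0 - x * x) * (1.0 / 3.0 - x * x)


def compensation_integral(scale, steps=10000):
    """Midpoint-rule estimate of the integral of theta * U(theta) over [0, scale]."""
    h = scale / steps
    return sum((i + 0.5) * h * u_filter((i + 0.5) * h, scale)
               for i in range(steps)) * h


def aperture_mass_at(kappa, y0, x0, scale):
    """Direct-sum M_ap at pixel (y0, x0); scale is in pixels, pixel area is 1."""
    ny, nx = len(kappa), len(kappa[0])
    r = int(math.ceil(scale))
    total = 0.0
    for y in range(max(0, y0 - r), min(ny, y0 + r + 1)):
        for x in range(max(0, x0 - r), min(nx, x0 + r + 1)):
            total += kappa[y][x] * u_filter(math.hypot(y - y0, x - x0), scale)
    return total
```

An $S$ map would follow by applying `aperture_mass_at` to every pixel of the noisy map and dividing by the rms of the same operation applied to a pure noise map. For large maps a Fourier-space convolution would be far faster than this direct sum.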

Using these maps as the input to our peak finding software
we find that the efficiency and completeness depend
on the filter size in the expected manner. Recalling that
the $M_{\rm ap}$ filter scale is roughly $3\times$ the Gaussian width for
similarly extended kernels, the results in Figs. 6 and 7 can be
seen to be quite similar. In the presence of noise the 'optimal'
filtering scale ($2'$-$4'$) is slightly larger than for the noise
free map. One could argue that the low completeness levels
we are finding are a result of our mass threshold, and
that we would find a larger fraction of the higher mass
clusters. This is in fact true, as the middle panel in Fig. 7
shows, but the price is a very low efficiency. In the middle
panel we give the completeness and efficiency keeping only
clusters above $3 \times 10^{14}\,h^{-1}M_\odot$ (see also Fig. 10). On balance
there is not a significant improvement compared to
the $10^{14}\,h^{-1}M_\odot$ mass threshold. Finally we have also
restricted ourselves to the clusters which lie in regions where
the lensing kernel is more than half of its peak value. These
clusters clearly have a larger probability of contributing
to a significant peak than clusters near the observer or
the sources. The completeness and efficiency numbers are
given in the lowest panel of Fig. 7. While the completeness
is everywhere higher than the top panel, the improvement
is marginal.

Fig. 7 shows that even quite high $S$ peaks can be inefficient
or incomplete. This does not mean that these peaks
are purely noise however. In many cases a strong peak
can be found to match to a lower mass object (or objects)
along the line-of-sight. To illustrate this we have matched
all groups above $10^{13}\,h^{-1}M_\odot$ with all peaks having $S > 4$
in the 5 fields. Fig. 8 shows a scatter-plot of all the unique
matches. Notice that there is a substantial tail of lower
mass groups even for $S > 4$. It is these low-mass groups
that are driving our low efficiencies.
Fig. 7. - Completeness and efficiency as a function of filtering
scale for $S$-statistic maps with noise added as in Fig. 6.
(top) All clusters above $10^{14}\,h^{-1}M_\odot$, (middle) all clusters above
$3 \times 10^{14}\,h^{-1}M_\odot$ and (bottom) clusters above $10^{14}\,h^{-1}M_\odot$ with
$w > w_{\max}/2$. Solid lines are the fraction of clusters in the 5 fields
which are matched (completeness), dashed lines are the fraction
of identified peaks which correspond to clusters (efficiency). The
symbol types denote the $S$ threshold for counting peaks: (squares)
$S > 1$, (triangles) $S > 3$, (circles) $S > 5$.
Fig. 8. - A scatter plot of unique matches of all groups with
$M_{200} > 10^{13}\,h^{-1}M_\odot$ and all peaks with $S > 4$ in our 5 maps with
a filtering scale of $\sim 3'$. We have divided the line-of-sight into three
intervals in distance. Open squares mark the closest 1/3, crosses the
middle 1/3 and filled triangles the most distant 1/3 of the distance
to the source. Note that there are a substantial number of objects
below our chosen mass threshold ($10^{14}\,h^{-1}M_\odot$) for clusters.
Our low efficiency is caused by 'noise' coming from both
intrinsic ellipticities and from cosmic structures. The measured
mass function will therefore be biased toward the
most massive objects. This problem could be overcome
if we knew the mass Probability Distribution Function of
those noise peaks, which we could try to deconvolve from
the measured mass function. Unfortunately, only the noise
peak $S$-distribution is known, as this can be derived analytically
given the ellipticity dispersion (van Waerbeke
2000). It is interesting to ask whether this problem could
be suppressed with an arbitrarily low noise mass map, such
as could be obtained by observing fields with a very long
exposure time.

To answer this we measured the mass function associated
with peaks in the noise-free fields which do not match
any real cluster above $10^{14}\,h^{-1}M_\odot$. We took peaks with
positive $\kappa$; the mass function would be higher if we lowered
this threshold or included any noise, and lower
if we raised the threshold. In order to assign a mass to a
fake peak, we followed Erben et al. (2000) and computed
the minimum mass that would naively be associated with
the peak. This mass corresponds to the deflector being
at the redshift where the lensing kernel peaks ($z \sim 0.5$ in
our case) and is a conservative way for observers to give
a lower mass to what they believe to be a real cluster.
Fig. 9 shows this minimum mass function of the missed
peaks (empty symbols) compared to the true mass function
(filled circles). The estimated mass depends of course
on the aperture one uses. The three sets of empty symbols
show the mass function obtained using three different
aperture sizes: we used a square of side $2.1'$, $2.8'$ and
$3.5'$, corresponding respectively to 0.7, 1, and 1.2 $h^{-1}$Mpc
radius at the assumed lens redshift of 0.5. This choice
matches the typical aperture size used in the literature,
and corresponds to the virial radii used to compute most
of the masses in this work (Fig. 3). We see that the
non-matched peaks are non-negligible below $3 \times 10^{14}\,h^{-1}M_\odot$.

Fig. 9. - The mass functions of 'real' and 'missed' peaks in
our noise-free maps. Filled circles show the true mass function.
Empty symbols show the minimum estimated mass of the missed
peaks (see text) which do not match a true cluster with mass above
$10^{14}\,h^{-1}M_\odot$. The mass of these peaks is computed in a square box
of width 2.1' (triangles), 2.8' (squares) and 3.5' (diamonds).



This means that even in the ideal case of noise-free data, a large
number of "clusters" in the range [$10^{14}\,h^{-1}M_\odot$,
$3\times10^{14}\,h^{-1}M_\odot$] are only projections of large-scale
structure. Lowering the mass threshold does not significantly change
Fig. 9.
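
The minimum-mass assignment described above can be illustrated with a short numerical sketch. It converts a mean convergence inside a square aperture into a mass by placing the deflector at z = 0.5 and multiplying by the critical surface density and the aperture area. The flat cosmology with Omega_m = 0.3, the single source plane at z_s = 1, the function names and the example mean convergence of 0.05 are all assumptions made for illustration; they are not taken from the analysis in this paper.

```python
import numpy as np

C_OVER_H0 = 2997.92458           # Hubble distance c/H0 in h^-1 Mpc
C2_OVER_4PIG = 1.663e18          # c^2 / (4 pi G) in M_sun per Mpc
ARCMIN = np.pi / (180.0 * 60.0)  # radians per arcminute


def comoving_distance(z, omega_m=0.3, n=2048):
    """Comoving distance in h^-1 Mpc for a flat universe (assumed cosmology)."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    dz = zz[1] - zz[0]
    # Trapezoidal rule written out explicitly.
    return C_OVER_H0 * dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))


def sigma_crit(z_lens, z_src, omega_m=0.3):
    """Critical surface density in h M_sun / Mpc^2 (physical units)."""
    chi_l = comoving_distance(z_lens, omega_m)
    chi_s = comoving_distance(z_src, omega_m)
    d_l = chi_l / (1.0 + z_lens)             # angular diameter distances
    d_s = chi_s / (1.0 + z_src)
    d_ls = (chi_s - chi_l) / (1.0 + z_src)   # valid for a flat universe
    return C2_OVER_4PIG * d_s / (d_l * d_ls)


def minimum_mass(kappa_mean, side_arcmin, z_lens=0.5, z_src=1.0, omega_m=0.3):
    """Mass in h^-1 M_sun inside a square aperture of the given angular side."""
    d_l = comoving_distance(z_lens, omega_m) / (1.0 + z_lens)
    side = side_arcmin * ARCMIN * d_l        # physical side in h^-1 Mpc
    return kappa_mean * sigma_crit(z_lens, z_src, omega_m) * side ** 2


if __name__ == "__main__":
    for side in (2.1, 2.8, 3.5):             # aperture sides quoted in the text
        mass = minimum_mass(0.05, side)      # 0.05 is a hypothetical mean kappa
        print(f"side {side}' -> {mass:.2e} h^-1 M_sun")
```

Because the mass scales with the square of the aperture side, the same peak is assigned a mass almost three times larger in the 3.5' box than in the 2.1' box, which is why the three sets of symbols in Fig. 9 are offset from one another.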

Phrased another way, the presence of a distribution 
of halos and large-scale structures provides a fluctuating 
background to our $\kappa$ maps. The clusters we are seeking are
embedded in this background, which has a similar effect 
on the mass function (broadening it) as does measurement 
noise. However this broadening depends on the r.m.s. of 
the mass map, which depends on the cosmological model 
one considers. Thus in order to recover the cluster mass 
function by deconvolution we would need to assume an 
underlying cosmological model. 
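
The broadening described above can be illustrated numerically: convolving a steeply falling mass function with a scatter in ln M moves objects preferentially from low to high mass, so the apparent abundance above a threshold exceeds the true one. In the sketch below the Schechter-like shape, the characteristic mass and the log-normal scatter of width 0.7 in ln M are placeholders chosen for illustration, not fits to our simulations.

```python
import numpy as np


def true_mass_function(ln_m, slope=1.0, m_star=3e14):
    """Toy dn/dlnM (arbitrary normalisation): power law with exponential cutoff."""
    x = np.exp(ln_m) / m_star
    return x ** (-slope) * np.exp(-x)


def scattered_mass_function(ln_m, dn_dlnm, sigma_ln_m):
    """Convolve dn/dlnM with a Gaussian in ln M (log-normal mass scatter)."""
    dx = ln_m[1] - ln_m[0]
    half = int(np.ceil(5.0 * sigma_ln_m / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (offsets / sigma_ln_m) ** 2)
    kernel /= kernel.sum()
    # mode="same" zero-pads the ends, so values within ~5 sigma of the
    # lowest mass on the grid are underestimated; we only use the high end.
    return np.convolve(dn_dlnm, kernel, mode="same")


def counts_above(ln_m, dn_dlnm, m_min):
    """Integrate dn/dlnM above m_min with a simple Riemann sum."""
    dx = ln_m[1] - ln_m[0]
    return dn_dlnm[ln_m >= np.log(m_min)].sum() * dx


ln_m = np.linspace(np.log(1e12), np.log(1e16), 801)
true_mf = true_mass_function(ln_m)
obs_mf = scattered_mass_function(ln_m, true_mf, sigma_ln_m=0.7)

boost = counts_above(ln_m, obs_mf, 3e14) / counts_above(ln_m, true_mf, 3e14)
print(f"apparent / true abundance above 3e14: {boost:.2f}")
```

Undoing this boost requires knowing the width of the scatter, which is the quantity that depends on the assumed cosmological model.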

Also note that Fig. 9 suggests that it is possible to obtain peaks in
a lensing map which can be interpreted as structures as massive as a
few $10^{14}\,h^{-1}M_\odot$ due simply to projection effects. This may
bear upon the reports of 'dark clumps' with masses around a few
$10^{14}\,h^{-1}M_\odot$ by Fischer (1999), Erben et al. (2000) and
Umetsu & Futamase (2000). In these papers, the authors computed the
probability for the S peak to be 'real' against random alignment of
lensed galaxies, but they neglected the effect of projection of lower
mass clumps discussed here.

We show in Fig. 10 that the efficiency and completeness are not
equally distributed among the different cluster masses. There is a
trend toward higher completeness for higher cluster masses, as
expected. However our completeness is still not 100% even for very
massive clusters. In our 5 fields there are a handful of massive
clusters with z ~ 1 and thus a low lensing efficiency. These do not
pass our $3\sigma$ threshold in Fig. 10. We caution that at the high mass
end we have relatively few clusters in our maps and in the 
underlying simulation volume. We thus become very sensi- 
tive to fluctuations both in the number density and in the 
redshift distribution of these objects. Thus these numbers 
should be taken as suggestive of some of the issues which 
can affect completeness with the caveat that they should 
be checked with a larger simulation before being used to 
'correct' any data. 
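
For concreteness, the two statistics discussed here can be computed from a peak list and a cluster list along the following lines. This is a minimal sketch which assumes that completeness is the fraction of clusters above the mass threshold that have a peak above the S threshold nearby, and that efficiency is the fraction of such peaks that lie near such a cluster. The positional matching within a fixed radius, the function name and the toy catalogue are illustrative assumptions, not the matching procedure used for Fig. 10.

```python
import numpy as np


def completeness_efficiency(cl_xy, cl_mass, pk_xy, pk_s,
                            mass_min, s_min, match_radius):
    """Return (completeness, efficiency) for a simple positional match.

    cl_xy, pk_xy : (N, 2) arrays of sky positions in the same angular units
    cl_mass      : cluster masses; pk_s : peak signal-to-noise values
    """
    clusters = np.asarray(cl_xy, float)[np.asarray(cl_mass) >= mass_min]
    peaks = np.asarray(pk_xy, float)[np.asarray(pk_s) >= s_min]
    if len(clusters) == 0 or len(peaks) == 0:
        return float("nan"), float("nan")
    # Pairwise separations, shape (n_clusters, n_peaks).
    sep = np.linalg.norm(clusters[:, None, :] - peaks[None, :, :], axis=2)
    matched = sep <= match_radius
    completeness = matched.any(axis=1).mean()  # clusters with >= 1 peak
    efficiency = matched.any(axis=0).mean()    # peaks with >= 1 cluster
    return float(completeness), float(efficiency)


# Toy catalogue (positions in arcmin, masses in h^-1 M_sun); purely made up.
cl_xy = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
cl_mass = [2e14, 5e14, 5e13]
pk_xy = [(0.5, 0.0), (30.0, 30.0), (10.0, 10.5), (20.0, 20.0)]
pk_s = [6.0, 5.0, 2.0, 7.0]

comp, eff = completeness_efficiency(cl_xy, cl_mass, pk_xy, pk_s,
                                    mass_min=1e14, s_min=3.0, match_radius=1.0)
print(comp, eff)
```

In the toy catalogue the massive cluster at (10, 10) is missed because its peak falls below the S threshold, and the strong peak at (20, 20) counts as a false positive because the halo behind it lies below the mass threshold. Both situations occur in our maps.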

We show the correlation between distance and S in Fig. 11. This
correlation is easy to understand: at fixed mass, low-S clusters
correspond preferentially to a low lensing efficiency, thus they are
at either high or low redshift. Since the cosmological volume is much
larger at high than at low redshift, we naturally catch more high
redshift clusters. This also points out a problem in using a fixed S
threshold for cluster selection: it will affect the redshift selection
function, and this should be properly taken into account in cluster
abundance analyses, for instance.
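
The redshift dependence of an S-selected sample can be made explicit with a small sketch of the lensing weight for a single source plane. It assumes a flat universe with Omega_m = 0.3, sources at z_s = 1, and the standard convergence weight (1 + z) chi (chi_s - chi) / chi_s in terms of comoving distance chi; with these assumed values the weight peaks near z ~ 0.5. Treating S at fixed mass as simply proportional to this weight is a further simplification, and all function names are hypothetical.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m=0.3, n=2048):
    """Comoving distance in h^-1 Mpc for a flat universe (assumed cosmology)."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    dz = zz[1] - zz[0]
    return C_OVER_H0 * dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))


def lensing_weight(z_lens, z_src=1.0, omega_m=0.3):
    """Convergence weight (1+z) chi (chi_s - chi) / chi_s for one source plane."""
    if z_lens <= 0.0 or z_lens >= z_src:
        return 0.0
    chi = comoving_distance(z_lens, omega_m)
    chi_s = comoving_distance(z_src, omega_m)
    return (1.0 + z_lens) * chi * (chi_s - chi) / chi_s


def redshift_window(s_at_peak, s_min, z_src=1.0, omega_m=0.3, n=199):
    """Redshift range over which a cluster with S = s_at_peak at the peak of
    the weight still satisfies S >= s_min, assuming S scales with the weight."""
    z = np.linspace(0.0, z_src, n + 2)[1:-1]
    w = np.array([lensing_weight(zi, z_src, omega_m) for zi in z])
    z_peak = float(z[np.argmax(w)])
    selected = z[s_at_peak * w / w.max() >= s_min]
    if selected.size == 0:
        return float("nan"), float("nan"), z_peak
    return float(selected.min()), float(selected.max()), z_peak


z_lo, z_hi, z_peak = redshift_window(s_at_peak=5.0, s_min=3.0)
print(f"weight peaks at z = {z_peak:.2f}; selected for {z_lo:.2f} < z < {z_hi:.2f}")
```

Raising the S threshold narrows the window around the peak of the weight, which is one way to see why a fixed S cut is not a fixed mass cut.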

4. COMPARISON WITH PREVIOUS WORK 

Ours is not the first study to investigate how complete 
a weak lensing survey of clusters could be using numerical 
simulations. It does however improve upon early work in 
some respects. The closest predecessor to our work, in
terms of focus, is that of Reblinsky & Bartelmann (1999).

These authors used N-body simulations to look at projection effects in
weak lensing selected cluster samples, and compared them to the
projection effects in richness selected clusters. They generated a 3D
cluster catalog from a single output of one of the GIF simulations in
an 85 $h^{-1}$Mpc box. Their weak lensing maps were constructed by
taking a projection of the mass in the box, constructing the 2D
lensing potential to calculate the shear, and using aperture mass
($M_{\rm ap}$) methods to find peaks. Since we start with a simulation
volume 44 times larger (containing many more clusters) and simulate
the entire ~2000 $h^{-1}$Mpc line-of-sight, not just one 85
$h^{-1}$Mpc piece of it, we could study projection effects over
cosmological scales. It turns out that this exacerbates the projection
effects seen in their work and makes our results slightly more
pessimistic than theirs.

Reblinsky & Bartelmann (1999) worked only at 2'. Our
efficiency/completeness at 2' (Fig. |^) looks similar to their
result, although it is difficult to compare the results directly
since Reblinsky & Bartelmann (1999) give only cumulative
numbers. Both sets of results can be explained
by the leakage of small mass peaks to higher masses due to 
the convolution of the true cluster mass function with the 
'noise'. This predicts that we should observe more high 
mass clusters than there really are, in accord with our re- 
sults. Here we have demonstrated how this result depends 
on the smoothing kernel used, with the dependence being 
non-trivial. In particular we found that beyond a given 
scale (which is kernel dependent) increasing the smooth- 
ing scale always pushes the efficiency close to 100% (but 
the price to pay is a low completeness). 

Simulations which include the entire line-of-sight have been performed
by Reblinsky, Kruse, Jain & Schneider (1999), following up the
semi-analytic work of Kruse & Schneider (1999). These authors made
"$M_{\rm ap}$ maps" from simulated shear maps, with and without adding
random noise. From these they extracted the number density of
$M_{\rm ap}$ peaks with S/N > 5 (where for the map with no noise they
estimated the noise due to intrinsic galaxy ellipticities). They find
quite good agreement with the analytic estimates, in that 5 peaks per
square degree are found above this detection threshold for the map
with no noise.$^1$ Unfortunately, due to the lack of knowledge of the
3D cluster positions in their analysis, it was impossible for them to
study the completeness/efficiency. This makes it impossible to compare
directly with our work.

5. DISCUSSION 

In the last few years it has become possible to search for 
clusters of galaxies directly as mass enhancements using 
weak gravitational lensing. This method probes the mass 
of a cluster independent of its dynamical state, and thus 
presents a different view to surveys based on galaxy counts 
or the intra-cluster medium. For this reason a lens selected 
survey of clusters is an appealing sample, which could in 
principle be mass selected allowing a reconstruction of the 
cluster mass function with redshift. 

Unfortunately there are many obstacles to be overcome. 
Our study strongly implies that complementary observa- 
tions (both weak and strong lensing, optical, Sunyaev- 
Zel'dovich, X-ray) will be of great help in cleaning a sample 
of lensing selected clusters of spurious detections and projection
effects (e.g. Castander 2000; Bartelmann 2001). An
example of this is the confirmation using redshifts of the
lensing selected cluster by Wittman et al. (2001). This will
probably involve a significant number of follow-up observations
on lensing (mass) preselected clusters. In this paper
we have begun to address the problem of designing such 
a survey, depending on the different goals (completeness, 
cluster redshift, mass range) one wants to achieve. 

Extending the work of Reblinsky & Bartelmann (1999) and Metzler et al.
(1999), we studied the problems associated with selecting clusters in
lensing data, using larger numerical simulations of clusters which
simulated the entire past light-cone. We focussed on the aperture mass
statistic, assuming one identified source population, and investigated
its dependence on cluster mass and redshift. We compared the catalogs
produced by different lensing-based analyses to the reference set of
3D clusters present in the simulation.

$^1$Note: this does not imply that the survey is complete, simply
that the analytic estimate produces the same fraction of identified
clusters as the simulation.

Fig. 10. - Efficiency and completeness as a function of mass for
maps with a ~3' filtering scale. Solid lines show completeness,
dashed lines efficiency. Solid squares are for S > 3, triangles S > 4
and open circles S > 5. The statistics at the high mass end become
very poor.

Fig. 11. - A scatter-plot showing distance and S for all clusters in
our 5 fields with mass above $2\times10^{14}\,h^{-1}M_\odot$ and S > 1. The more
distant clusters, having a lower lensing efficiency, have lower S at
fixed mass. The y axis runs from z = 0 to z = 1, with z = 0.3 being
~850 $h^{-1}$Mpc from the observer and z = 0.5 being 1300 $h^{-1}$Mpc.

As discussed before, measurements of the masses of individual clusters
from weak lensing have a large scatter (100%) and a significant bias
(about 20%), for clusters more massive than $10^{14}\,h^{-1}M_\odot$
(see Fig. ). The bias comes from the fact that clusters live
preferentially in larger structures. The large scatter is due to the
presence of a large number of halos of different masses making up the
large scale structure. Phrased in terms of a mass, the 'noise' induced
by this structure is comparable to the signal from a cluster of
$10^{14}\,h^{-1}M_\odot$ put at a redshift of ~0.5. This may bear upon
the existence of dark clusters with mass of a few
$10^{14}\,h^{-1}M_\odot$ in lensing surveys; however, to be sure
whether projection effects are the explanation would require
simulating the distribution of the light (i.e. galaxies) under the
same observational conditions. If light suffers less projection than
mass, this may explain the dark clusters. In agreement with Hoekstra
(2001) we find that the noise due to uncorrelated large-scale
structure along the line-of-sight does not obscure the signal from
sufficiently massive (~ h^^ Mq) clusters.



This 'mass scatter' biases the mass function measured with lensing.
The scatter contains an intrinsic contribution, which is cosmological
model dependent, and a measurement noise contribution determined by
the intrinsic galaxy ellipticities and the smoothing kernel. It is
possible to model the noise in the S statistic (and to predict the
number density of observed S peaks, as in Reblinsky et al. 1999 for
instance), but it is much more difficult to transpose this S-noise
into a mass noise since S is not simply related to the mass (see
Fig. ||), and this relation depends on the cosmological model.

Even if we restrict ourselves to the methods explored 
here, lensing surveys should be relatively complete for the 
highest mass clusters ($\gtrsim 3\times10^{14}\,h^{-1}M_\odot$) with reasonable
efficiency (0.1 to 0.5 if S > 5 for example). However it is
important to take into account the variation of the lensing 
kernel with redshift in interpreting the mass threshold of 
an S-selected sample. 

It is not clear to what extent these issues can be over- 
come. In this work, we have not made use of filters 
matched to the cluster profiles or of multiple source popu- 
lations. In principle incorporating either of these could in- 
crease the efficiency or completeness of our samples. How- 
ever, at the very least these issues have to be considered in 
projects aimed at performing a statistical measure of cluster
masses from weak lensing. In particular the trade-off between
completeness/efficiency and mass bias is an important consideration
for plans to measure lower mass clumps, like groups of galaxies
(Schneider & Kneib 1998; Möller et al. 2001).